Systems and methods for enabling two parties to find an intersection between private data sets without learning anything other than the intersection of the datasets

ABSTRACT

A system and method are disclosed for comparing private sets of data. The method includes encoding first elements of a first data set such that each element of the first data set is assigned a respective number in a first table, encoding second elements of a second data set such that each element of the second data set is assigned a respective number in a second table, applying a private compare function to compute an equality of each row of the first table and the second table to yield an analysis and, based on the analysis, generating a unique index of similar elements between the first data set and the second data set.

The present application is a continuation of U.S. patent application Ser. No. 17/591,779, filed Feb. 3, 2022, the contents of which are incorporated herein by reference in their entirety.

TECHNICAL FIELD

The present disclosure generally relates to private data sets and enabling two parties to find an intersection between two data sets without learning anything about the other party's data set other than the intersection.

BACKGROUND

Existing solutions for computing private set intersections require a comparison of each data point in a first data set with each data point in a second data set to see where they match. Thus, if there are two data sets of size m and n, the system will need to perform (m×n) comparisons, which can be infeasible if m and n are large numbers.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to describe the manner in which the above-recited and other advantages and features of the disclosure can be obtained, a more particular description of the principles briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only exemplary embodiments of the disclosure and are not therefore to be considered to be limiting of its scope, the principles herein are described and explained with additional specificity and detail through the use of the accompanying drawings in which:

FIG. 1A illustrates the issue with different private data sets and how a comparison can be done but where there is a need to maintain privacy;

FIG. 1B illustrates the approach of using multi-party computation to compute private set intersections;

FIG. 2 illustrates a method embodiment related to computing private set intersections; and

FIG. 3 illustrates a system embodiment.

INTRODUCTION

Certain aspects and embodiments of this disclosure are provided below. Some of these aspects and embodiments may be applied independently and some of them may be applied in combination as would be apparent to those of skill in the art. In the following description, for the purposes of explanation, specific details are set forth in order to provide a thorough understanding of embodiments of the application. However, it will be apparent that various embodiments may be practiced without these specific details. The figures and description are not intended to be restrictive.

The ensuing description provides example embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the following description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the application as set forth in the appended claims.

BRIEF DESCRIPTION

Proposed herein is a novel approach using a multi-party computation function, such as the “privatecompare” function, to determine an intersection of private sets of data. A system and method are disclosed to accomplish this task. The method includes encoding first elements of a first data set such that each element of the first data set is assigned a respective number in a first table, encoding second elements of a second data set such that each element of the second data set is assigned a respective number in a second table, applying a private compare function to compute an equality of each row of the first table and the second table to yield an analysis and, based on the analysis, generating a unique index of similar elements between the first data set and the second data set.

An example system includes a processor and a computer-readable storage device storing instructions which, when executed by the processor, cause the processor to perform operations including encoding first elements of a first data set such that each element of the first data set is assigned a respective number in a first table, encoding second elements of a second data set such that each element of the second data set is assigned a respective number in a second table, applying a private compare function to compute an equality of each row of the first table and the second table to yield an analysis and, based on the analysis, generating a unique index of similar elements between the first data set and the second data set.

This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used in isolation to determine the scope of the claimed subject matter. The subject matter should be understood by reference to appropriate portions of the entire specification of this patent, any or all drawings, and each claim.

The foregoing, together with other features and embodiments, will become more apparent upon referring to the following specification, claims, and accompanying drawings.

DETAILED DESCRIPTION

Disclosed herein is a new system for using a multi-party computation function, such as the “privatecompare” function, to determine an intersection of private sets of data. Various embodiments are disclosed to accomplish this task.

FIG. 1A illustrates an issue with respect to comparing private data sets. The private data sets 100 shown include a data set A with a first entry of the word “Hi” 102 and a second entry of the word “And” 104. Other data is shown down to the m^(th) value. Data set B includes a first value of “And” 106 and a second value of “Fred” 108 with additional data down to the n^(th) value of “Car”. If one were to determine an intersection of these two data sets, the typical approach would be to compare data set A's first value “Hi” 102 to every value in data set B up to the n^(th) value of “Car”, and then do the same for each of the remaining values of data set A, from the second through the m^(th). This would result in m×n calculations or comparisons 110, which could be a very large number if m and n are large numbers.
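As a rough illustration of the m×n cost described for FIG. 1A, the following minimal Python sketch counts the comparisons made by the exhaustive approach. It is not part of the disclosure: the function name `naive_intersection` and the filler entry "Tree" are hypothetical, and the comparison is done in the clear, so it shows only the comparison count and none of the privacy properties.

```python
def naive_intersection(set_a, set_b):
    """Compare every element of set_a with every element of set_b.

    Returns the matching elements and the number of comparisons made,
    which is always len(set_a) * len(set_b).
    """
    matches = []
    comparisons = 0
    for a in set_a:
        for b in set_b:
            comparisons += 1
            if a == b:
                matches.append(a)
    return matches, comparisons


# Entries echo FIG. 1A; "Tree" is a hypothetical filler value.
data_set_a = ["Hi", "And", "Tree"]
data_set_b = ["And", "Fred", "Car"]

matches, comparisons = naive_intersection(data_set_a, data_set_b)
print(matches, comparisons)
```

With m = n = 3 the loop performs 9 comparisons; the count grows as the product of the two sizes.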

FIG. 1B illustrates different tables or datasets 120 and how they can be compared using the principles disclosed herein. A first data set S_(a) 122 includes the five names of David, Joe, Sarah, Fran, and Harry. A second data set S_(b) 124 includes four names Harry, Maria, Kate and David. The process here is to use multi-party computation to compute a private set intersection. Multi-party computation (MPC) is a cryptographic tool that allows multiple parties to make calculations using their combined data, without revealing their individual input. MPC works by using complex encryption to distribute computation between multiple parties. This disclosure uses in one aspect MPC and in other aspects the comparison can be done using other tools. The way it can work is as follows.

The first data set S_(a) 122 for a first party can have a size of m. The second data set S_(b) 124 for a second party can have a data set size of n. The first step is to encode each element to an integer. The encoding can use any algorithm 126, 128 to do the encoding. The approach can use a public table hash function to generate a unique index for the similar elements and reduce the number of comparisons. A public function (f) can operate such that: f(David)=0, f(Joe)=1, f(Sarah)=2, f(Harry)=3, f(Fran)=4, f(Maria)=5, f(Kate)=6. The function can be run on all of the data in both data sets with the results shown in tables 130 and 132. Note that the parties know the public function but they do not know the other party's data other than the overlapping intersection data. The approach enables them to know what data they share but nothing else. The private compare or similar algorithm 134 can be used to compute the equality of each row of the two tables and thus reduce the number of comparisons from 20 pairwise comparisons (4×5) to 7, one per table row. In this case, the parties can identify quickly that they share the values of “David” and “Harry”. That is all they will find out about the other private set of data.
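The row-by-row comparison of FIG. 1B can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the public function f is modeled as a fixed lookup table, f(Harry) is taken to be 3 on the assumption that the indices run consecutively from 0 to 6, and `private_compare_stand_in` is a hypothetical placeholder that uses plain equality where a real system would run a multi-party computation protocol.

```python
# Public function f, known to both parties (modeled here as a lookup table).
PUBLIC_F = {
    "David": 0, "Joe": 1, "Sarah": 2, "Harry": 3,
    "Fran": 4, "Maria": 5, "Kate": 6,
}
TABLE_ROWS = 7


def build_table(data_set):
    """Place each element in the row given by the public function."""
    table = [None] * TABLE_ROWS
    for element in data_set:
        table[PUBLIC_F[element]] = element
    return table


def private_compare_stand_in(row_a, row_b):
    """Hypothetical placeholder: plain equality, NOT a private protocol."""
    return row_a is not None and row_a == row_b


def intersect_by_rows(table_a, table_b):
    """Compare the two tables one row at a time and collect matching rows."""
    shared_rows = []
    comparisons = 0
    for index, (row_a, row_b) in enumerate(zip(table_a, table_b)):
        comparisons += 1
        if private_compare_stand_in(row_a, row_b):
            shared_rows.append(index)
    return shared_rows, comparisons


s_a = ["David", "Joe", "Sarah", "Fran", "Harry"]
s_b = ["Harry", "Maria", "Kate", "David"]

table_a = build_table(s_a)
table_b = build_table(s_b)
shared_rows, comparisons = intersect_by_rows(table_a, table_b)
shared_names = [table_a[i] for i in shared_rows]
print(shared_rows, shared_names, comparisons)
```

Seven row comparisons replace the twenty pairwise comparisons, and only rows 0 and 3 ("David" and "Harry") match.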

It is possible to get better performance if the system allows collisions to happen in the table hash function. A collision occurs when more than one value to be hashed by a particular hash function hashes to the same slot in the table or data structure (hash table) being generated by the hash function. For two lists each of size 100,000, there can be two options. First, collisions are not allowed. In this case, the system needs to choose a big hash table like a 20 million-row table (to avoid collisions) and the process will end up with 20 million comparisons. In a second option, collisions are allowed. In this case, the system can choose a hash table with 25,000 rows and might have at maximum almost 12 collisions per row. The system will need to do 144*25000 comparisons (12×12 comparisons for each row), which is 3.6 million or approximately 4 million comparisons. The approach of allowing collisions can thus improve the performance as far fewer comparisons are needed.
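The collision-tolerant option can be sketched as a bucketed table in which each row holds a list of elements and comparisons are made only within matching rows. The sketch below is illustrative only: using SHA-256 as the public hash, the names `public_row`, `build_buckets`, `intersect_buckets`, and `worst_case_comparisons`, and the small 4-row table are all assumptions, and plain equality again stands in for the private compare function.

```python
import hashlib


def public_row(element, num_rows):
    """Map an element to a table row using a public hash (SHA-256 assumed)."""
    digest = hashlib.sha256(element.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_rows


def build_buckets(data_set, num_rows):
    """Build a table whose rows may hold several colliding elements."""
    buckets = [[] for _ in range(num_rows)]
    for element in data_set:
        buckets[public_row(element, num_rows)].append(element)
    return buckets


def intersect_buckets(buckets_a, buckets_b):
    """Compare elements only within the same row of the two tables."""
    shared = []
    comparisons = 0
    for row_a, row_b in zip(buckets_a, buckets_b):
        for a in row_a:
            for b in row_b:
                comparisons += 1
                if a == b:  # stand-in for the private compare function
                    shared.append(a)
    return shared, comparisons


def worst_case_comparisons(num_rows, max_per_row):
    """Upper bound when every row is treated as holding max_per_row elements."""
    return num_rows * max_per_row * max_per_row


s_a = ["David", "Joe", "Sarah", "Fran", "Harry"]
s_b = ["Harry", "Maria", "Kate", "David"]

shared, comparisons = intersect_buckets(build_buckets(s_a, 4), build_buckets(s_b, 4))
print(sorted(shared), comparisons)
print(worst_case_comparisons(25000, 12))
```

Because equal elements always hash to the same row, no match is missed. The bound `worst_case_comparisons(25000, 12)` reproduces the 144 × 25,000 figure from the text, which compares favorably with the 20 million comparisons of the collision-free table.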

FIG. 2 illustrates a method embodiment. The method 200 includes one or more steps of encoding first elements of a first data set such that each element of the first data set is assigned a respective number in a first table (202), encoding second elements of a second data set such that each element of the second data set is assigned a respective number in a second table (204), applying a private compare function to compute an equality of each row of the first table and the second table to yield an analysis (206) and, based on the analysis, generating a unique index of similar elements between the first data set and the second data set (208).

The respective number can be an integer or a non-integer value. In one aspect, the private compare function is applied using multi-party computation.

The step of encoding the first elements and encoding the second elements can be performed using a table hash function. The table hash function can be known by a first party associated with the first data set and a second party associated with the second data set.

The respective number in the first table and the respective number in the second table can be a result of applying a public hash function to each element in the first data set and the second data set.

In one aspect, the unique index of similar elements between the first table and the second table can include an intersection of the first data set and the second data set in a manner that neither a first party associated with the first data set nor a second party associated with the second data set can learn anything other than about the intersection of the first data set and the second data set. The step of encoding the first elements can further include applying a public function to generate first indices for the first data set, and the step of encoding the second elements can further include applying the public function to generate second indices for the second data set.

The private compare function can include a table hash function.

Furthermore, the encoding of the first elements and the encoding of the second elements can be performed using a public function, and the private compare function can include a public table hash function.

An example system is shown in FIG. 3 and can include a processor and a computer-readable storage device storing instructions which, when executed by the processor, cause the processor to perform operations including encoding first elements of a first data set such that each element of the first data set is assigned a respective number in a first table, encoding second elements of a second data set such that each element of the second data set is assigned a respective number in a second table, applying a private compare function to compute an equality of each row of the first table and the second table to yield an analysis and, based on the analysis, generating a unique index of similar elements between the first data set and the second data set.

FIG. 3 illustrates an example computing device that can be used in connection with any of the systems disclosed herein. In this example, FIG. 3 illustrates a computing system 300 including components in electrical communication with each other using a connection 305, such as a bus. System 300 includes a processing unit (CPU or processor) 310 and a system connection 305 that couples various system components including the system memory 315, such as read only memory (ROM) 320 and random access memory (RAM) 325, to the processor 310. The system 300 can include a cache of high-speed memory connected directly with, in close proximity to, or integrated as part of the processor 310. The system 300 can copy data from the memory 315 and/or the storage device 330 to the cache 312 for quick access by the processor 310. In this way, the cache can provide a performance boost that avoids processor 310 delays while waiting for data. These and other modules can control or be configured to control the processor 310 to perform various actions. Other system memory 315 may be available for use as well. The memory 315 can include multiple different types of memory with different performance characteristics. The processor 310 can include any general purpose processor and a hardware or software service or module, such as service (module) 1 332, service (module) 2 334, and service (module) 3 336 stored in storage device 330, configured to control the processor 310 as well as a special-purpose processor where software instructions are incorporated into the actual processor design. The processor 310 may be a completely self-contained computing system, containing multiple cores or processors, a bus, memory controller, cache, etc. A multi-core processor may be symmetric or asymmetric.

To enable user interaction with the device 300, an input device 345 can represent any number of input mechanisms, such as a microphone for speech, a touch-sensitive screen for gesture or graphical input, keyboard, mouse, motion input, speech and so forth. An output device 335 can also be one or more of a number of output mechanisms known to those of skill in the art. In some instances, multimodal systems can enable a user to provide multiple types of input to communicate with the device 300. The communications interface 340 can generally govern and manage the user input and system output. There is no restriction on operating on any particular hardware arrangement and therefore the basic features here may easily be substituted for improved hardware or firmware arrangements as they are developed.

Storage device 330 is a non-volatile memory and can be a hard disk or other types of computer readable media which can store data that are accessible by a computer, such as magnetic cassettes, flash memory cards, solid state memory devices, digital versatile disks, cartridges, random access memories (RAMs) 325, read only memory (ROM) 320, and hybrids thereof.

The storage device 330 can include services or modules 332, 334, 336 for controlling the processor 310. Other hardware or software modules are contemplated. The storage device 330 can be connected to the system connection 305. In one aspect, a hardware module that performs a particular function can include the software component stored in a computer-readable medium in connection with the necessary hardware components, such as the processor 310, connection 305, output device 335, and so forth, to carry out the function.

In some cases, such a computing device or apparatus may include a processor, microprocessor, microcomputer, or other component of a device that is configured to carry out the steps of the methods disclosed above. In some examples, such computing device or apparatus may include one or more antennas for sending and receiving RF signals. In some examples, such computing device or apparatus may include an antenna and a modem for sending, receiving, modulating, and demodulating RF signals, as previously described.

The components of the computing device can be implemented in circuitry. For example, the components can include and/or can be implemented using electronic circuits or other electronic hardware, which can include one or more programmable electronic circuits (e.g., microprocessors, graphics processing units (GPUs), digital signal processors (DSPs), central processing units (CPUs), and/or other suitable electronic circuits), and/or can include and/or be implemented using computer software, firmware, or any combination thereof, to perform the various operations described herein. The computing device may further include a display (as an example of the output device or in addition to the output device), a network interface configured to communicate and/or receive the data, any combination thereof, and/or other component(s). The network interface may be configured to communicate and/or receive Internet Protocol (IP) based data or other type of data.

The methods discussed above are illustrated as a logical flow diagram, the operations of which represent a sequence of operations that can be implemented in hardware, computer instructions, or a combination thereof. In the context of computer instructions, the operations represent computer-executable instructions stored on one or more computer-readable storage media that, when executed by one or more processors, perform the recited operations. Generally, computer-executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular data types. The order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement the processes.

Additionally, the methods disclosed herein may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or combinations thereof. As noted above, the code may be stored on a computer-readable or machine-readable storage medium, for example, in the form of a computer program including a plurality of instructions executable by one or more processors. The computer-readable or machine-readable storage medium may be non-transitory.

The term “computer-readable medium” includes, but is not limited to, portable or non-portable storage devices, optical storage devices, and various other mediums capable of storing, containing, or carrying instruction(s) and/or data. A computer-readable medium may include a non-transitory medium in which data can be stored and that does not include carrier waves and/or transitory electronic signals propagating wirelessly or over wired connections. Examples of a non-transitory medium may include, but are not limited to, a magnetic disk or tape, optical storage media such as compact disk (CD) or digital versatile disk (DVD), flash memory, memory or memory devices. A computer-readable medium may have stored thereon code and/or machine-executable instructions that may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, or the like.

In some embodiments the computer-readable storage devices, mediums, and memories can include a cable or wireless signal containing a bit stream and the like. However, when mentioned, non-transitory computer-readable storage media expressly exclude media such as energy, carrier signals, electromagnetic waves, and signals per se.

Specific details are provided in the description above to provide a thorough understanding of the embodiments and examples provided herein. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For clarity of explanation, in some instances the present technology may be presented as including individual functional blocks including devices, device components, steps or routines in a method embodied in software, or combinations of hardware and software. Additional components may be used other than those shown in the figures and/or described herein. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.

Individual embodiments may be described above as a process or method which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed, but can have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.

Processes and methods according to the above-described examples can be implemented using computer-executable instructions that are stored or otherwise available from computer-readable media. Such instructions can include, for example, instructions and data which cause or otherwise configure a general purpose computer, special purpose computer, or a processing device to perform a certain function or group of functions. Portions of computer resources used can be accessible over a network. The computer executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, firmware, or source code. Examples of computer-readable media that may be used to store instructions, information used, and/or information created during methods according to described examples include magnetic or optical disks, flash memory, USB devices provided with non-volatile memory, networked storage devices, and so on.

Devices implementing processes and methods according to these disclosures can include hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof, and can take any of a variety of form factors. When implemented in software, firmware, middleware, or microcode, the program code or code segments to perform the necessary tasks (e.g., a computer-program product) may be stored in a computer-readable or machine-readable medium. A processor(s) may perform the necessary tasks. Typical examples of form factors include laptops, smart phones, mobile phones, tablet devices or other small form factor personal computers, personal digital assistants, rackmount devices, standalone devices, and so on. Functionality described herein also can be embodied in peripherals or add-in cards. Such functionality can also be implemented on a circuit board among different chips or different processes executing in a single device, by way of further example.

The instructions, media for conveying such instructions, computing resources for executing them, and other structures for supporting such computing resources are example means for providing the functions described in the disclosure.

In the foregoing description, aspects of the application are described with reference to specific embodiments thereof, but those skilled in the art will recognize that the application is not limited thereto. Thus, while illustrative embodiments of the application have been described in detail herein, it is to be understood that the inventive concepts may be otherwise variously embodied and employed, and that the appended claims are intended to be construed to include such variations, except as limited by the prior art. Various features and aspects of the above-described application may be used individually or jointly. Further, embodiments can be utilized in any number of environments and applications beyond those described herein without departing from the broader spirit and scope of the specification. The specification and drawings are, accordingly, to be regarded as illustrative rather than restrictive. For the purposes of illustration, methods were described in a particular order. It should be appreciated that in alternate embodiments, the methods may be performed in a different order than that described.

One of ordinary skill will appreciate that the less than (“<”) and greater than (“>”) symbols or terminology used herein can be replaced with less than or equal to (“≤”) and greater than or equal to (“≥”) symbols, respectively, without departing from the scope of this description.

Where components are described as being “configured to” perform certain operations, such configuration can be accomplished, for example, by designing electronic circuits or other hardware to perform the operation, by programming programmable electronic circuits (e.g., microprocessors, or other suitable electronic circuits) to perform the operation, or any combination thereof.

The phrase “coupled to” refers to any component that is physically connected to another component either directly or indirectly, and/or any component that is in communication with another component (e.g., connected to the other component over a wired or wireless connection, and/or other suitable communication interface) either directly or indirectly.

Claim language or other language reciting “at least one of” a set and/or “one or more” of a set indicates that one member of the set or multiple members of the set (in any combination) satisfy the claim. For example, claim language reciting “at least one of A and B” or “at least one of A or B” means A, B, or A and B. In another example, claim language reciting “at least one of A, B, and C” or “at least one of A, B, or C” means A, B, C, or A and B, or A and C, or B and C, or A and B and C. The language “at least one of” a set and/or “one or more” of a set does not limit the set to the items listed in the set. For example, claim language reciting “at least one of A and B” or “at least one of A or B” can mean A, B, or A and B, and can additionally include items not listed in the set of A and B.

Although a variety of examples and other information was used to explain aspects within the scope of the appended claims, no limitation of the claims should be implied based on particular features or arrangements in such examples, as one of ordinary skill would be able to use these examples to derive a wide variety of implementations. Further and although some subject matter may have been described in language specific to examples of structural features and/or method steps, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to these described features or acts. For example, such functionality can be distributed differently or performed in components other than those identified herein. Rather, the described features and steps are disclosed as examples of components of systems and methods within the scope of the appended claims.

Claim language reciting “at least one of” a set indicates that one member of the set or multiple members of the set satisfy the claim. For example, claim language reciting “at least one of A and B” means A, B, or A and B.

We claim:
 1. A method comprising: computing, via a private compare function, an equality of each row of a first table generated from a first data set on a first computing device and each row of a second table generated from a second data set on a second computing device to yield an analysis, wherein the private compare function comprises a table hash function that allows collisions in which more than one value in the first data set hashes to a same slot in the first table or that more than one value in the second data set hashes to a same slot in the second table; and based on the analysis, generating a unique index of similar elements between the first data set and the second data set.
 2. The method of claim 1, wherein the first computing device is independent of the second computing device.
 3. The method of claim 2, further comprising: encoding, via a first processor associated with the first computing device, first elements of the first data set such that each element of the first data set is assigned a respective number in the first table; and encoding, via a second processor associated with the second computing device, second elements of the second data set such that each element of the second data set is assigned a respective number in the second table.
 4. The method of claim 3, wherein encoding the first elements and encoding the second elements is performed using the table hash function.
 5. The method of claim 4, wherein the table hash function is known by a first party associated with the first data set and a second party associated with the second data set.
 6. The method of claim 3, wherein the respective number in the first table and the respective number in the second table are a result of applying a public hash function to each element in the first data set and the second data set, wherein the public hash function is publicly known and used by both the first computing device and the second computing device to generate the unique index of similar elements between the first data set and the second data set.
 7. The method of claim 1, wherein the unique index of similar elements between the first data set and the second data set comprises an intersection of the first data set and the second data set in a manner that neither the first computing device associated with the first data set nor the second computing device associated with the second data set learns anything other than about the intersection of the first data set and the second data set.
 8. The method of claim 3, wherein encoding the first elements further comprises applying a public hash function to generate first indices for the first data set and encoding the second elements further comprises applying the public hash function to generate second indices for the second data set, wherein the public hash function is publicly known and used by both the first computing device and the second computing device to generate the unique index of similar elements between the first data set and the second data set.
 9. The method of claim 1, wherein the private compare function comprises a table hash function.
 10. The method of claim 3, wherein the encoding of the first elements and the encoding of the second elements is performed using a public hash function, wherein the public hash function is publicly known and used by both the first computing device and the second computing device to generate the unique index of similar elements between the first data set and the second data set.
 11. A system comprising: a processor; a computer-readable storage device storing instructions which, when executed by the processor, cause the processor to perform operations comprising: computing, via a private compare function, an equality of each row of a first table generated from first data set on a first computing device and each row of a second table generated from second data set on a second computing device to yield an analysis, wherein the private compare function comprises a table hash function that allows collisions in which more than one value in the first data set hashes to a same slot in the first table or that more than one value in the second data set hashes to a same slot in the second table; and based on the analysis, generating a unique index of similar elements between the first data set and the second data set.
 12. The system of claim 11, wherein the first computing device is independent of the second computing device.
 13. The system of claim 11, wherein the first computing device is independent of the second computing device.
 14. The system of claim 11, wherein the computer-readable storage device stores additional instructions which, when executed by the first processor, cause the first processor to perform operations further comprising: encoding first elements of the first data set such that each element of the first data set is assigned a respective number in the first table; and encoding second elements of the second data set such that each element of the second data set is assigned a respective number in the second table.
 15. The system of claim 14, wherein the table hash function is known by a first party associated with the first data set and a second party associated with the second data set.
 16. The system of claim 14, wherein the respective number in the first table and the respective number in the second table are a result of applying a public hash function to each element in the first data set and the second data set, wherein the public hash function is publicly known and used by both the first computing device and the second computing device to generate the unique index of similar elements between the first data set and the second data set.
 17. The system of claim 11, wherein the unique index of similar elements between the first data set and the second data set comprises an intersection of the first data set and the second data set in a manner that neither the first computing device associated with the first data set nor the second computing device associated with the second data set learns anything other than about the intersection of the first data set and the second data set.
 18. The system of claim 13, wherein encoding the first elements further comprises applying a public hash function to generate first indices for the first data set and encoding the second elements further comprises applying the public hash function to generate second indices for the second data set, wherein the public hash function is publicly known and used by both the first computing device and the second computing device to generate the unique index of similar elements between the first data set and the second data set.
 19. The system of claim 11, wherein the private compare function comprises a table hash function.
 20. The system of claim 14, wherein the encoding of the first elements and the encoding of the second elements is performed using a public hash function, wherein the public hash function is publicly known and used by both the first computing device and the second computing device to generate the unique index of similar elements between the first data set and the second data set.